Points of View

Authors

  • MICHAEL S. ROSENBERG
  • SUDHIR KUMAR
Abstract

Taxon sampling is often thought to be of extreme importance for phylogenetic inference, and increased sampling of taxa is commonly advocated as a solution to resolving problematic phylogenies. Another solution is to increase the number of sites (by sequencing additional genes) sampled for each taxon. In an ideal world, one would like to increase samples of both taxa and genes, but taxon sampling has not kept pace with gene sampling because of the increasing ease of, and emphasis on, genome sequencing.

The question of taxon sampling is necessarily driven by resource limitation. The precise scope of “sufficient” taxon sampling always depends on the questions being addressed. If we need to know the complete phylogeny of a genus, we must sample the genus exhaustively. In experimental design, partial sampling is an issue only when certain taxa can stand as proxies for the clades to which they belong (clade-based or stratified sampling; see Hillis, 1998). In bioinformatics studies, taxon sampling is restricted by the data available in genetic databases (database-restricted sampling). Clearly, the nature of the problem in these two research programs is different. In stratified sampling, we are interested in knowing whether to sequence more genes per species or fewer genes for a large number of species per clade. In contrast, in database-restricted sampling it is important to know whether the overall accuracy of inferred phylogenetic trees for small taxa sets is similar to that of trees inferred from larger taxa sets.

We recently addressed the issue of database-restricted sampling (Rosenberg and Kumar, 2001) and concluded that although there was a consistent decrease in error when using more taxa, the decrease was generally minor relative to the number of taxa added to the data set. Pollock et al. (2002) challenged this conclusion by modifying our measure of the phylogenetic error.
This measure, ΔE, differs from ours in that we used the difference in error between the subsampled tree (E_S) and the fully sampled tree (E_P), whereas Pollock et al. (2002) divided this difference by E_S to measure the relative reduction in error. ΔE plotted against the number of additional taxa in the fully sampled tree (= 66 minus the number of taxa in the subsample tree) shows a clear positive effect (Pollock et al., 2002: Figs. 4, 5). Unfortunately, this impressive result brings little biological benefit, as clearly shown by a scatterplot of the average number of additional branches inferred correctly in each case (Fig. 1). In no instance are more than 1.5 additional branches reconstructed correctly, even though the number of taxa has often increased manyfold. For instance, more than doubling the number of taxa led to an average increase of only 0.7 correct branches (points in the middle of the x-axis in Fig. 1). This fact was clearly noted in our original article: “Note that even though E_S is greater than E_G and E_P for very small subsamples (<10 taxa), the difference in phylogenetic error is usually much smaller than one branch per tree” (Rosenberg and Kumar, 2001: 10754). Therefore, although an increase in the number of taxa sampled will lead to improvement in accuracy, the improvement is minimal, particularly when we consider the amount of data (in terms of the total number of nucleotides) being added. We do not advocate using fewer taxa when more are available, as is clear from the results presented by Rosenberg and Kumar (2001: 10754).

Zwickl and Hillis (2002) also challenged the conclusions reached by Rosenberg and Kumar (2001) by using the concept of tree diameter (the maximum distance between all pairs of taxa) to partition genes with different subsampled sets of taxa for analysis. They showed that four-taxon subsamples with a smaller tree diameter generate more accurate results than subsamples with larger tree diameters.
This result is expected because, with sequence divergence and length kept constant, the larger diameter four-taxon trees will encompass higher average divergence and would thus involve larger estimation errors. Furthermore, for the simulations involving the model tree in Figure 2a, four-taxon data sets containing sequences with larger diameters would include interordinal relationships (with many small interior branches) more frequently than would small-diameter samples (see also Zwickl and Hillis, 2002: Fig. 3a). Therefore, Zwickl and Hillis’s study is an examination of the phylogenetic error at different evolutionary divergence cross sections of the specific phylogenetic tree simulated. This and the complete absence of resource limitation (a must for any sampling study) clearly establish that Zwickl and Hillis have not evaluated either stratified or database-restricted taxon-sampling problems. Therefore, Zwickl and Hillis were not correct in stating that their results contradict our previous results (Rosenberg
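
The contrast drawn above between the two error measures can be made concrete with a minimal sketch. It assumes only what the text states: the absolute measure is the difference between the subsample error (E_S) and the full-tree error (E_P), and the relative measure of Pollock et al. (2002) divides that difference by E_S. The function name, the class name, and the input values are hypothetical and chosen purely for illustration; they are not taken from either study.

```python
from dataclasses import dataclass


@dataclass
class ErrorComparison:
    absolute: float  # E_S - E_P, in branches per tree
    relative: float  # (E_S - E_P) / E_S, a unitless fraction


def compare_errors(e_s: float, e_p: float) -> ErrorComparison:
    """Return the absolute and relative reduction in phylogenetic error.

    e_s: error of the tree inferred from the taxon subsample (E_S)
    e_p: error of the tree inferred from the full taxon sample (E_P)
    """
    if e_s <= 0:
        raise ValueError("E_S must be positive for the relative measure")
    diff = e_s - e_p
    return ErrorComparison(absolute=diff, relative=diff / e_s)


# Hypothetical values, for illustration only (not data from any study).
result = compare_errors(e_s=2.0, e_p=1.5)
print(f"absolute reduction: {result.absolute:.2f} branches per tree")
print(f"relative reduction: {result.relative:.0%}")
```

With these illustrative inputs the relative reduction is 25% while the absolute gain is half a branch per tree, which is the kind of gap between the two measures that the argument above turns on.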

Publication date: 2003